Chinese Short Text Classification by ERNIE Based on LTC_Block

Abstract

Short text classification, an important direction of basic research in natural language processing, has extensive applications. Its effect depends on feature extraction and representation methods. This paper proposed an LTC_Block-based short text classification model named ERNIE to classify Chinese short texts and extract semantics in the corpus, addressing the polysemy problem in text. In this model, LTC_Block, a double-channel structural unit composed of BiLSTM and TextCNN, was used to extract contextual sequences and overall features of the semantics, and a residual connection was used to integrate these features and further classify the texts. Experiments on two different datasets showed that the model achieved better results than mainstream models, proving its feasibility and effectiveness.
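The double-channel idea in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the abstract does not say how the two channels are merged or where the residual connection sits, so the wiring below (per-token concatenation, a linear projection back to the input width, then adding the input) is an assumption. The class name `LTCBlockSketch`, the layer sizes, the random untrained weights, and the omission of biases and of TextCNN's usual max-over-time pooling are likewise illustrative choices. In the paper's setting the input vectors would presumably come from ERNIE; here they are random numbers.

```python
import math
import random

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class LSTM:
    """Single-direction LSTM layer; biases are omitted to keep the sketch short."""
    def __init__(self, in_dim, hid, rng):
        self.hid = hid
        # Input, forget, output and candidate gates stacked in one matrix over [x; h].
        self.W = rand_matrix(4 * hid, in_dim + hid, rng)

    def run(self, seq):
        H = self.hid
        h, c, outs = [0.0] * H, [0.0] * H, []
        for x in seq:
            z = matvec(self.W, x + h)  # x + h is list concatenation
            i = [sigmoid(v) for v in z[:H]]
            f = [sigmoid(v) for v in z[H:2 * H]]
            o = [sigmoid(v) for v in z[2 * H:3 * H]]
            g = [math.tanh(v) for v in z[3 * H:]]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
            outs.append(h)
        return outs

class LTCBlockSketch:
    """Hypothetical two-channel block: BiLSTM channel + convolution channel + residual."""
    def __init__(self, dim, hid=4, n_filters=4, kernel=3, seed=0):
        rng = random.Random(seed)
        self.dim, self.kernel = dim, kernel  # kernel is assumed to be odd
        self.fwd = LSTM(dim, hid, rng)
        self.bwd = LSTM(dim, hid, rng)
        self.conv = rand_matrix(n_filters, kernel * dim, rng)
        self.proj = rand_matrix(dim, 2 * hid + n_filters, rng)

    def bilstm(self, seq):
        # Contextual channel: concatenate forward and backward states per token.
        f = self.fwd.run(seq)
        b = self.bwd.run(seq[::-1])[::-1]
        return [a + c for a, c in zip(f, b)]

    def cnn(self, seq):
        # Local n-gram channel: zero-padded 1D convolution followed by ReLU.
        pad = self.kernel // 2
        zero = [0.0] * self.dim
        padded = [zero] * pad + seq + [zero] * pad
        out = []
        for t in range(len(seq)):
            window = [v for tok in padded[t:t + self.kernel] for v in tok]
            out.append([max(0.0, v) for v in matvec(self.conv, window)])
        return out

    def forward(self, seq):
        # Merge both channels per token, project to the input width, add the input.
        merged = [s + c for s, c in zip(self.bilstm(seq), self.cnn(seq))]
        return [[p + x for p, x in zip(matvec(self.proj, m), tok)]
                for m, tok in zip(merged, seq)]

# Usage: five token vectors of width 6 (random stand-ins for encoder outputs).
rng = random.Random(1)
tokens = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(5)]
out = LTCBlockSketch(dim=6).forward(tokens)
print(len(out), len(out[0]))  # 5 6
```

Because the block returns a sequence of the same shape as its input, such blocks could be stacked, and a pooling step plus a softmax layer would still be needed to produce class labels; neither is shown here.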


Similar resources

Chinese Short Text Classification Based on Domain Knowledge

People are generating more and more short texts. There is an urgent demand to classify short texts into different domains. Due to the shortness and sparseness of short texts, conventional methods based on Vector Space Model (VSM) have limitations. To tackle the data scarcity problem, we propose a new model to directly measure the correlation between a short text instance and a domain instead of...


Short Text Classification Based on Improved ITC

Long text classification has achieved great results, but short text classification still needs improvement. In this paper, we first describe why we select the ITC feature selection algorithm rather than the conventional TFIDF, and the advantages of ITC over TFIDF; we then summarize the flaws of the conventional ITC algorithm, and then we present an improved ITC feature selec...


Chinese Short-Text Classification Based on Topic Model with High-Frequency Feature Expansion

Short text differs from traditional documents in its shortness and sparseness. Feature extension can ease the problem of high sparseness in the vector space model, but it inevitably introduces noise. To resolve this problem, this paper proposes a high-frequency feature expansion method based on a latent Dirichlet allocation (LDA) topic model. High-frequency features are extracted from each cate...


Short Text Classification on Complaint Documents

The Indonesian government has developed a system for citizens to voice their aspirations and complaints, which are then stored in the form of short documents. Unfortunately, the existing system employs human annotators to manually categorize the short documents, which is very expensive and time-consuming. As a result, automatically classifying the short documents into their correct topics will redu...


An Arguing Lexicon for Stance Classification on Short Text Comments in Chinese

With the development of social media and online forums, users have grown accustomed to expressing their agreement and disagreement via short texts. Elements that reveal the user’s stance or subjectivity thus become an important resource in identifying the user’s position on a given topic. In the current study, we observe comments of an online bulletin board in Taiwan for how people express the...



Journal

Journal title: Wireless Communications and Mobile Computing

Year: 2022

ISSN: 1530-8669, 1530-8677

DOI: https://doi.org/10.1155/2022/1411744